File Hash calculation
The File Hash acts as a digital fingerprint that uniquely identifies each file. By default, during processing, Epiq Discover calculates the File Hash using one of the following methods and populates the File Hash field.
- When email or ICS (Calendar) information is available, Epiq Discover calculates the File Hash using the following properties:
  - Date
  - To (count)
  - From
  - CC (count)
  - BCC (count)
  - Subject
  - Body Text
  [Figure: File hash for email and calendar items]
- When email or ICS information is not available, Epiq Discover calculates the binary hash of the file. The system also uses the binary hash for Microsoft Teams chat MSG files.
- When a File Hash is not provided in the DAT file during document loading through Desktop Client, the system calculates the File Hash using the raw file bytes. However, when you reprocess these files, the system recalculates the File Hash using the methods described above. As a result, the File Hash value may differ between the loading and reprocessing stages.
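The choice between the two methods can be pictured with a short Python sketch. It is illustrative only and does not reflect the actual Epiq Discover implementation: the hash algorithm (SHA-256), the field separator, and the names `EmailInfo`, `metadata_hash`, `binary_hash`, and `file_hash` are all assumptions. The sketch also assumes that To (count), CC (count), and BCC (count) refer to the number of recipients in each field.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EmailInfo:
    """Hypothetical container for the email/ICS properties listed above."""
    date: str
    sender: str          # the From property
    subject: str
    body_text: str
    to: List[str] = field(default_factory=list)
    cc: List[str] = field(default_factory=list)
    bcc: List[str] = field(default_factory=list)


def metadata_hash(info: EmailInfo) -> str:
    """Hash the listed properties; recipient fields contribute only their counts."""
    parts = [
        info.date,
        str(len(info.to)),    # To (count)
        info.sender,          # From
        str(len(info.cc)),    # CC (count)
        str(len(info.bcc)),   # BCC (count)
        info.subject,
        info.body_text,
    ]
    # Assumption: values are joined with a separator and hashed with SHA-256.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def binary_hash(raw: bytes) -> str:
    """Hash the raw file bytes."""
    return hashlib.sha256(raw).hexdigest()


def file_hash(raw: bytes, info: Optional[EmailInfo] = None,
              is_teams_chat: bool = False) -> str:
    """Use the metadata hash when email/ICS information is available,
    otherwise (and for Teams chat MSG files) fall back to the binary hash."""
    if info is not None and not is_teams_chat:
        return metadata_hash(info)
    return binary_hash(raw)
```

In this sketch, `file_hash(raw)` without email information returns the same value as `binary_hash(raw)`, while `file_hash(raw, info)` depends only on the listed properties. That difference mirrors why a value calculated from raw file bytes at loading can differ from the value recalculated at reprocessing.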
Custom File Hash calculation
For ICS files and email files, such as MSG and EML, you can specify the fields used for File Hash calculation. This feature enables the deduplication of identical emails collected and processed from multiple platforms, such as an email archive and an active email server. The system does not support Custom File Hash calculation for the following documents.
- Parent ICS files that are collected using Collect > Microsoft 365
- Bloomberg email files
- Microsoft Teams chat MSG files
By default, this feature is disabled. You can enable it in Project Settings. When you apply Custom File Hash calculation, the system populates both the File Hash and File Hash – Original fields: the File Hash field displays the Custom File Hash value, and the File Hash – Original field displays the default File Hash value. When you do not apply Custom File Hash calculation, the system populates only the File Hash field.
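The relationship between the two fields can be sketched in Python. This is a hypothetical illustration, not the product's implementation: the function names `hash_fields` and `populate_hash_fields`, the use of SHA-256, and the representation of a document as a dictionary of property values are assumptions made for the example.

```python
import hashlib
from typing import Dict, Iterable, Optional

# Default properties used for email and ICS items, as listed earlier.
DEFAULT_FIELDS = (
    "Date", "To (count)", "From", "CC (count)",
    "BCC (count)", "Subject", "Body Text",
)


def hash_fields(props: Dict[str, str], fields: Iterable[str]) -> str:
    """Hash only the selected properties (assumed: joined, then SHA-256)."""
    joined = "\x1f".join(props.get(name, "") for name in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def populate_hash_fields(props: Dict[str, str],
                         custom_fields: Optional[Iterable[str]] = None
                         ) -> Dict[str, str]:
    """Return the hash fields that would be populated for one document."""
    default_value = hash_fields(props, DEFAULT_FIELDS)
    if not custom_fields:
        # Custom calculation not applied: only File Hash is populated.
        return {"File Hash": default_value}
    return {
        "File Hash": hash_fields(props, custom_fields),
        "File Hash – Original": default_value,
    }
```

For example, if two copies of the same email from different platforms differ only in a property that is left out of the custom selection, `populate_hash_fields` returns the same File Hash for both copies while their File Hash – Original values still differ, which is what allows them to deduplicate.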
The following list provides related topics.
- To enable Custom File Hash calculation, refer to Modify Custom File Hash calculation setting.
- For more information about how the Processing Information field is coded for the File Hash calculation, refer to Processing Information.